Communication-Efficient Distributed SGD with Error-Feedback, Revisited


Abstract

We show that the convergence proof of a recent algorithm called dist-EF-SGD for distributed stochastic gradient descent with communication efficiency using error-feedback, due to Zheng et al. (NeurIPS 2019), is mathematically problematic. Concretely, the original error bound for arbitrary sequences of learning rates is unfortunately incorrect, which invalidates the upper bound in the convergence theorem for the algorithm. As evidence, we explicitly provide several counter-examples, in both convex and non-convex cases, demonstrating the incorrectness of the bound. We fix the issue by providing a new error bound and its corresponding proof, leading to a new convergence theorem for the dist-EF-SGD algorithm and thereby recovering its mathematical analysis.
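To make the object of the analysis concrete, the error-feedback mechanism can be sketched as follows. This is a minimal single-worker illustration, not Zheng et al.'s exact dist-EF-SGD: the top-k compressor, the learning-rate placement, and all names here are our own assumptions for exposition.

```python
import numpy as np

def top_k(v, k):
    # Illustrative compressor (assumption): keep the k largest-magnitude
    # entries of v and zero out the rest (k >= 1).
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def ef_sgd_step(x, grad, error, lr, k):
    # Error-feedback: add the residual left over from the previous
    # compression to the current scaled gradient, compress, and carry
    # forward the part that was not transmitted.
    corrected = lr * grad + error
    update = top_k(corrected, k)      # what would be communicated
    new_error = corrected - update    # residual kept locally
    return x - update, new_error
```

In the distributed algorithm each worker keeps its own error vector and the server aggregates the compressed updates; the disputed bound concerns how these accumulated error vectors behave under arbitrary learning-rate sequences.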


Related Articles

QSGD: Communication-Efficient SGD via Gradient Quantization and Encoding

Parallel implementations of stochastic gradient descent (SGD) have received significant research attention, thanks to its excellent scalability properties. A fundamental barrier when parallelizing SGD is the high bandwidth cost of communicating gradient updates between nodes; consequently, several lossy compression heuristics have been proposed, by which nodes only communicate quantized gradient...
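The quantization idea can be illustrated with a short sketch. This is an unbiased stochastic rounding scheme reconstructed from the description above, not the paper's exact encoding; the function name and the number of levels `s` are illustrative assumptions.

```python
import numpy as np

def stochastic_quantize(v, s, rng=None):
    # Map each coordinate of v to one of s levels of |v_i| / ||v||,
    # rounding up or down at random so the result is unbiased: E[out] == v.
    if rng is None:
        rng = np.random.default_rng()
    norm = np.linalg.norm(v)
    if norm == 0:
        return np.zeros_like(v)
    scaled = np.abs(v) / norm * s            # position in [0, s]
    low = np.floor(scaled)
    prob_up = scaled - low                   # round up with this probability
    level = low + (rng.random(v.shape) < prob_up)
    return np.sign(v) * level * norm / s
```

Because only the norm, the signs, and small integer levels need to be transmitted, the communicated message is far smaller than the full-precision gradient.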


Error Exponents for Distributed Detection with Feedback

We investigate the effects of feedback on a decentralized detection system consisting of N sensors and a detection center. It is assumed that observations are independent and identically distributed across sensors, and that each sensor compresses its observations into a fixed number of quantization levels. We consider two variations on this setup. One entails the transmission of sensor data to ...


Revisiting Distributed Synchronous SGD

Distributed training of deep learning models on large-scale training data is typically conducted with asynchronous stochastic optimization to maximize the rate of updates, at the cost of additional noise introduced from asynchrony. In contrast, the synchronous approach is often thought to be impractical due to idle time wasted on waiting for straggling workers. We revisit these conventional bel...
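For contrast with the asynchronous setting, a fully synchronous step can be sketched in a few lines (a schematic, assumption-laden sketch; the worker gradients are taken as given rather than computed):

```python
import numpy as np

def synchronous_sgd_step(x, worker_grads, lr):
    # Synchronous update: wait for one gradient from every worker,
    # average them, and apply a single step. The step is only as fast
    # as the slowest (straggling) worker, which is the cost at issue.
    return x - lr * np.mean(worker_grads, axis=0)
```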


Communication-Efficient Distributed Statistical Inference

We present a Communication-efficient Surrogate Likelihood (CSL) framework for solving distributed statistical inference problems. CSL provides a communication-efficient surrogate to the global likelihood that can be used for low-dimensional estimation, high-dimensional regularized estimation and Bayesian inference. For low-dimensional estimation, CSL provably improves upon naive averaging schem...


Communication Efficient Distributed Agnostic Boosting

We consider the problem of learning from distributed data in the agnostic setting, i.e., in the presence of arbitrary forms of noise. Our main contribution is a general distributed boosting-based procedure for learning an arbitrary concept space, that is simultaneously noise tolerant, communication efficient, and computationally efficient. This improves significantly over prior works that were ...



Journal

Journal title: International Journal of Computational Intelligence Systems

Year: 2021

ISSN: 1875-6883, 1875-6891

DOI: https://doi.org/10.2991/ijcis.d.210412.001